Abstract:
Quantization is a prominent technique for reducing the memory footprint of Spiking Neural Networks (SNNs) without significantly decreasing accuracy. However, state-of-the-art works focus only on applying weight quantization directly from a specific quantization scheme, i.e., either post-training quantization (PTQ) or in-training quantization (ITQ), and do not consider (1) quantizing other SNN parameters (e.g., the neurons' membrane potential), (2) exploring different combinations of quantization approaches (i.e., quantization schemes, precision levels, and rounding schemes), and (3) selecting the SNN model with a good memory-accuracy trade-off at the end. Therefore, the memory savings these works offer while meeting the targeted accuracy are limited, which hinders processing SNNs on resource-constrained systems (e.g., IoT-Edge devices). To address this, we propose Q-SpiNN, a novel quantization framework for memory-efficient SNNs. The key mechanisms of Q-SpiNN are: (1) employing quantization for different SNN parameters based on their significance to the accuracy, (2) exploring different combinations of quantization schemes, precision levels, and rounding schemes to find efficient SNN model candidates, and (3) developing an algorithm that quantifies the benefit of the memory-accuracy trade-off obtained by the candidates and selects the Pareto-optimal one. The experimental results show that, for the unsupervised network, Q-SpiNN reduces the memory footprint by ca. 4x while maintaining the accuracy within 1% of the baseline on the MNIST dataset. For the supervised network, Q-SpiNN reduces the memory footprint by ca. 2x while keeping the accuracy within 2% of the baseline on the DVS-Gesture dataset.
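The two ingredients named in this abstract, rounding schemes and Pareto-based selection, can be illustrated with a minimal sketch. It is not the Q-SpiNN implementation: the function names `quantize` and `pareto_front`, the fixed-point format, and the candidate tuples are all assumptions made for illustration, and the abstract does not specify how Q-SpiNN scores the trade-off.

```python
import math
import random


def quantize(value, frac_bits, rounding="nearest", rng=None):
    """Quantize a real value to a fixed-point grid with `frac_bits` fractional bits.

    Hypothetical helper: the rounding options are generic textbook schemes,
    not necessarily the exact ones used by Q-SpiNN.
    """
    scale = 1 << frac_bits          # grid has 2**frac_bits steps per unit
    scaled = value * scale
    if rounding == "truncate":
        # Round toward negative infinity, like dropping low-order bits.
        q = math.floor(scaled)
    elif rounding == "nearest":
        q = math.floor(scaled + 0.5)
    elif rounding == "stochastic":
        # Round up with probability equal to the fractional remainder.
        rng = rng or random
        low = math.floor(scaled)
        q = low + (1 if rng.random() < scaled - low else 0)
    else:
        raise ValueError(f"unknown rounding scheme: {rounding}")
    return q / scale


def pareto_front(candidates):
    """Return names of candidates not dominated in (memory, accuracy).

    `candidates` is a list of (name, memory_bytes, accuracy) tuples.
    A candidate is dominated if another one uses no more memory and is
    at least as accurate, and is strictly better in one of the two.
    """
    front = []
    for name, mem, acc in candidates:
        dominated = any(
            m <= mem and a >= acc and (m < mem or a > acc)
            for _, m, a in candidates
        )
        if not dominated:
            front.append(name)
    return front
```

For example, `quantize(0.4, 2, "truncate")` and `quantize(0.4, 2, "nearest")` land on different grid points (0.25 and 0.5), which is why the rounding scheme is worth exploring alongside the precision level.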
Abstract:
Due to the excessive use of cloud-based machine learning (ML) services, smart cyber-physical systems (CPS) are becoming increasingly vulnerable to black-box attacks on their ML modules. Traditionally, black-box attacks are either transfer attacks that require model stealing, or score/decision-based gradient estimation attacks that require a large number of queries. In practical scenarios, especially for cloud-based ML services and timing-constrained CPS use cases, every query incurs a huge cost, which renders state-of-the-art decision-based attacks ineffective in such settings. To address this, we propose a novel methodology for automatically generating an extremely fast and imperceptible decision-based attack called FaDec. It follows two main steps: (1) fast estimation of the classification boundary by combining a half-interval search-based algorithm with gradient sign estimation to reduce the number of queries; and (2) adversarial noise optimization to ensure imperceptibility. For illustration, we evaluate FaDec on image recognition and traffic sign detection using multiple state-of-the-art DNNs trained on the CIFAR-10 and German Traffic Sign Recognition Benchmarks (GTSRB) datasets. The experimental analysis shows that the proposed FaDec attack is 16x faster than state-of-the-art decision-based attacks and generates an attack image with better imperceptibility in far fewer iterations, thereby making our attack more powerful in practical scenarios. We open-sourced the complete code and results of our methodology at https://github.com/fklodhi/FaDec.
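The half-interval search in step (1) can be sketched as a binary search along the line between the original input and a known adversarial input, using only the classifier's decision. This is a generic illustration rather than the FaDec code: `boundary_search`, the `is_adversarial` oracle, and the flat-list image representation are assumptions, and the gradient sign estimation and noise optimization steps are not shown.

```python
def boundary_search(original, adversarial, is_adversarial, tol=1e-3):
    """Locate the decision boundary between two inputs by half-interval search.

    Assumptions (hypothetical setup): inputs are flat lists of floats,
    `is_adversarial(x)` is a decision-only oracle that returns True when the
    model misclassifies x, it is True for `adversarial`, False for `original`,
    and the decision flips only once along the connecting line.
    """
    lo, hi = 0.0, 1.0   # blend factor: 0 -> original, 1 -> adversarial
    queries = 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        candidate = [(1 - mid) * o + mid * a
                     for o, a in zip(original, adversarial)]
        queries += 1        # each oracle call is one query to the model
        if is_adversarial(candidate):
            hi = mid        # still adversarial: move closer to the original
        else:
            lo = mid        # no longer adversarial: back off
    # Return the point on the adversarial side of the boundary.
    boundary = [(1 - hi) * o + hi * a for o, a in zip(original, adversarial)]
    return boundary, queries
```

Because the interval halves on every query, the query count grows only logarithmically with the required precision, which is the property that matters when each query is costly.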
Abstract:
Capsule Networks (CapsNets), recently proposed by the Google Brain team, have superior learning capabilities in machine learning tasks such as image classification compared to traditional CNNs. However, CapsNets require extremely intense computations and are difficult to deploy in their original form on resource-constrained edge devices. This paper makes the first attempt to quantize CapsNet models, to enable their efficient edge implementations, by developing a specialized quantization framework for CapsNets. We evaluate our framework on several benchmarks. On a deep CapsNet model for the CIFAR10 dataset, the framework reduces the memory footprint by 6.2x, with only 0.15% accuracy loss. We will open-source our framework at https://git.io/JvDIF.
Abstract:
Approximate Computing (AC) has emerged as a means of improving the performance, area, and power/energy efficiency of a digital design at the cost of output quality degradation. Applications like machine learning (e.g., using deep neural networks, DNNs) are highly computationally intensive and can therefore benefit significantly from AC and specialized accelerators. However, the accuracy loss introduced by approximations in the DNN accelerator hardware can lead to undesirable results. This paper presents a novel method to design high-performance DNN accelerators in which the approximation error(s) from one stage/part of the design are "completely" compensated in the subsequent stage/part while offering significant efficiency gains. To this end, the paper also presents a case study on improving the performance of systolic array-based hardware architectures, which are commonly used for accelerating state-of-the-art deep learning algorithms.
Abstract:
Bio-signals exhibit high redundancy, and the algorithms for their processing are inherently error resilient. This property can be leveraged to improve the energy efficiency of IoT-Edge devices (wearables) through the emerging trend of approximate computing. This paper presents XBioSiP, a novel methodology for approximate bio-signal processing that employs two quality evaluation stages, during the pre-processing and bio-signal processing stages, to determine the approximation parameters. It thereby achieves high energy savings while satisfying the user-determined quality constraint. Our methodology achieves up to $19\times$ and $22\times$ reduction in the energy consumption of a QRS peak detection algorithm for 0% and <1% loss in peak detection accuracy, respectively.
Abstract:
Generative Adversarial Networks (GANs) have gained importance because of their tremendous unsupervised learning capability and their many applications in data generation, for example, text-to-image synthesis, synthetic medical data generation, video generation, and artwork generation. Hardware acceleration for GANs is challenging due to their intrinsically complex computational phases, which require efficient data management during training and inference. In this work, we propose a distributed on-chip memory architecture that aims to efficiently handle the data for the complex computations involved in GANs, such as strided convolution and transposed convolution. We also propose a controller that improves the computational efficiency by pre-arranging the data from either the off-chip memory or the computational units before storing it in the on-chip memory. Our architectural enhancement achieves a 3.65x performance improvement over the state-of-the-art and reduces the number of read accesses and write accesses by 85% and 75%, respectively.
Abstract:
Adversarial examples have emerged as a significant threat to machine learning algorithms, especially to convolutional neural networks (CNNs). In this paper, we propose two quantization-based defense mechanisms, Constant Quantization (CQ) and Trainable Quantization (TQ), to increase the robustness of CNNs against adversarial examples. CQ quantizes input pixel intensities based on a “fixed” number of quantization levels, while in TQ the quantization levels are “iteratively learned during the training phase”, thereby providing a stronger defense mechanism. We apply the proposed techniques to undefended CNNs against different state-of-the-art adversarial attacks from the open-source Cleverhans library. The experimental results demonstrate a 50%–96% and a 10%–50% increase in the classification accuracy on perturbed images generated from the MNIST and CIFAR-10 datasets, respectively, on a commonly used CNN (Conv2D(64, 8×8)-Conv2D(128, 6×6)-Conv2D(128, 5×5)-Dense(10)-Softmax()) available in the Cleverhans library.
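The intuition behind quantizing input pixel intensities to a fixed number of levels can be shown with a small sketch. The abstract does not give the exact quantization function used by CQ, so the uniform, evenly spaced mapping below and the name `constant_quantize` are assumptions for illustration only.

```python
def constant_quantize(pixels, levels):
    """Snap intensities in [0, 1] to `levels` evenly spaced values.

    Hypothetical uniform quantizer: small perturbations that do not push a
    pixel across a level boundary are removed before the CNN sees the input.
    """
    if levels < 2:
        raise ValueError("need at least two quantization levels")
    step = 1.0 / (levels - 1)       # distance between adjacent levels
    quantized = []
    for p in pixels:
        clipped = min(max(p, 0.0), 1.0)   # keep the value in range
        quantized.append(round(clipped / step) * step)
    return quantized
```

With four levels, a clean pixel of 0.20 and a perturbed pixel of 0.23 both map to the same level, so the perturbation has no effect on the quantized input; a larger perturbation that crosses a level boundary would still pass through.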
Abstract:
Most data manipulation attacks on deep neural networks (DNNs) during the training stage introduce perceptible noise that can be countered by preprocessing during inference or identified during the validation phase. Therefore, data poisoning attacks during inference (e.g., adversarial attacks) are becoming more popular. However, many of them do not consider the imperceptibility factor in their optimization algorithms, and can be detected by correlation and structural similarity analysis, or noticed (e.g., by humans) in a multi-level security system. Moreover, the majority of inference attacks rely on some knowledge about the training dataset. In this paper, we propose a novel methodology that automatically generates imperceptible attack images by using the back-propagation algorithm on pre-trained DNNs, without requiring any information about the training dataset (i.e., completely training data-unaware). We present a case study on traffic sign detection using the VGGNet trained on the German Traffic Sign Recognition Benchmarks dataset in an autonomous driving use case. Our results demonstrate that the generated attack images successfully cause misclassification while remaining imperceptible in both “subjective” and “objective” quality tests.